1  Introduction


In 1974 Dennis and Moré [3] gave a characterization of those quasi-Newton methods
for the nonlinear equation problem that produce Q-superlinearly convergent iterates.
This characterization immediately carries over to unconstrained optimization by
working with the nonlinear equation (gradient equal to zero) that results from the
first-order necessary conditions. Similarly, the Dennis-Moré characterization can be carried
over to equality constrained optimization by working with the nonlinear system corresponding
to the first-order necessary conditions. This nonlinear system involves the two groups of
variables (x, y). Here x is the vector of primal variables, and y is the vector of dual variables
corresponding to the equality constraints. Hence the approach characterizes Q-superlinear
convergence in terms of the variable pair (x, y). Indeed, the first authors to establish
Q-superlinear convergence for various secant methods for equality constrained optimization,
Han [8] in 1976, Tapia [12] in 1977, and Glad [7] in 1979, did so using this approach and
established Q-superlinear convergence in the pair (x, y). Not long after, in 1982, Boggs,
Tolle, and Wang [1] observed that under certain assumptions, various quasi-Newton secant
methods for equality constrained optimization actually give Q-superlinear convergence in the
primal variable x alone. They then proceeded to establish a characterization of those
quasi-Newton methods that produce iterates which are Q-superlinearly convergent in the primal
variable x alone. Nocedal and Overton [9] in 1983, and Fontecilla, Steihaug, and Tapia [6] in
1987, derived the Boggs-Tolle-Wang characterization under less restrictive assumptions than
those used by Boggs, Tolle, and Wang. Finally, in 1987 Stoer and Tapia [11] gave a very
short and self-contained derivation of the Boggs-Tolle-Wang characterization.

Recently, there has been activity in extending the successful primal-dual Newton
interior-point method from linear programming to general nonlinear programming. In linear
programming, the primal-dual Newton method, although not initially presented in this
manner, is now recognized as a damped and perturbed Newton method applied to the
Karush-Kuhn-Tucker (KKT) necessary conditions. This interpretation serves as the vehicle for its
extension to nonlinear programming. In 1992, El-Bakry, Tapia, Tsuchiya, and Zhang [5]
established the local convergence properties of the Newton interior-point method for
nonlinear programming (NLP). These convergence results are in line with those of the standard
Newton's method. In 1993 Yamashita and Yabe [14] considered quasi-Newton interior-point
formulations and used the Dennis-Moré theory to derive a characterization of those methods
which give Q-superlinear convergence in all of the variables. The KKT conditions involve a
vector of primal variables x, a vector of dual variables y corresponding to equality
constraints, and a vector of dual variables z corresponding to the nonnegativity constraints on
the primal variables x. Hence the variables consist of the triple (x, y, z), and x and z are
required to be nonnegative.

We see from the Boggs-Tolle-Wang theory that while the variables involved in
quasi-Newton methods for equality constrained optimization are the primal variables x and the
dual variables y, it is possible to obtain a characterization result in terms of the primal
variables alone. Hence, in some sense the primal variables are also the primary variables.
This understanding led us initially to try to obtain a characterization in terms of the primal
variables x for quasi-Newton interior-point methods as well. However, we could not do so
without including some undesirable assumption on the interaction between the primal variables x
and the dual variables z. This, in turn, led us to search for a characterization in terms of the
variables (x, z). Our search was successful and is the subject of the current research. It is
interesting, then, that in the sense alluded to above, the primary variables for quasi-Newton
interior-point methods are the nonnegative variables (x, z).

This paper is organized as follows. In Section 2, with an eye towards our main
characterization result, we study the characterization of Q-superlinear convergence for a damped and
perturbed quasi-Newton method for the nonlinear equation problem. Our intention is not to
give a complete theory for the topic, but to develop the tools needed for our interior-point
application. In Section 3, we describe our quasi-Newton interior-point method. In Section
4, we derive an equivalence between our quasi-Newton interior-point method and a damped
and perturbed quasi-Newton method for a system of nonlinear equations that involves only
the variables (x, z). This equivalence has the flavor of the approach taken by Stoer and
Tapia [11] when they derived the Boggs-Tolle-Wang characterization. In Section 5, we apply
the theory developed in Section 2 to the equivalent formulation obtained in Section 4 and
establish our main characterization results.

2  Characterization for damped and perturbed quasi-Newton methods.

In  this  section  we  formulate  and  study  a  damped  and  perturbed  quasi-Newton  method  for 
the  nonlinear  equation  problem.  Our  objective  is  to  derive  characterization  results  concerning 
Q-superlinear  convergence  that  can  be  used  to  establish  our  main  characterization  theorem 
for  quasi-Newton  interior-point  methods  in  Section  5. 

Consider  the  nonlinear  equation  problem 


$$F(x) = 0 \tag{2.1}$$

where $F : \mathbb{R}^n \to \mathbb{R}^n$. Recall that the standard Newton's method theory assumptions for
problem (2.1) are

S1. There exists $x^* \in \mathbb{R}^n$ such that $F(x^*) = 0$.

S2. The Jacobian matrix $F'(x^*)$ is nonsingular.

S3. The Jacobian operator $F'$ is Lipschitz continuous at $x^*$ in an open convex neighborhood
of $x^*$, with Lipschitz constant $\gamma > 0$.

As usual the expressions $F_k$, $F_{k+1}$, and $F_*$ denote the evaluation of the function $F$ at the
points $x_k$, $x_{k+1}$, and $x^*$ respectively. Similar notation will be used for other quantities.


By a damped and perturbed quasi-Newton method for problem (2.1), we mean
the construction of the iteration sequence

$$x_{k+1} = x_k - \alpha_k A_k^{-1}[F_k - r_k], \qquad k = 0, 1, 2, \ldots \tag{2.2}$$

In (2.2), $0 < \alpha_k \le 1$ is the steplength parameter, $r_k \in \mathbb{R}^n$ is a perturbation vector, and
$A_k$ is an approximation to $F'_k$.

We begin by collecting some known useful facts. Toward this end let $e_k = x_k - x^*$ and
$s_k = x_{k+1} - x_k$; assume S1-S3, and that $\{x_k\}$ converges to $x^*$.

There exists a constant $\rho > 0$ such that for $k$ sufficiently large

$$\frac{1}{\rho}\|e_k\| \le \|F_k\| \le \rho\|e_k\|. \tag{2.3}$$

A proof of (2.3) can be found, for example, in Dembo, Eisenstat, and Steihaug [2]. It
follows that

$$\frac{\|e_{k+1}\|}{\|e_k\|} \to 0 \implies \frac{\|s_k\|}{\|e_k\|} \to 1 \tag{2.4}$$

and

$$\frac{\|e_{k+1}\|}{\|e_k\|} \to 0 \iff \frac{\|F_{k+1}\|}{\|s_k\|} \to 0. \tag{2.5}$$

To establish (2.4) we merely need to observe that $e_{k+1} = s_k + e_k$. Moreover, (2.5) follows
directly once we write

$$\frac{\|F_{k+1}\|}{\|e_k\|} = \frac{\|F_{k+1}\|}{\|s_k\|}\,\frac{\|s_k\|}{\|e_k\|}.$$

The next two theorems will motivate choices for the steplength $\alpha_k$ and the perturbation
vector $r_k$.

Theorem 2.1. Let $\{x_k\}$ be generated by (2.2). Assume that S1, S2, and S3 hold and that
$x_k \to x^*$. Then any two of the following statements imply the third:

(i) $x_k \to x^*$ Q-superlinearly.

(ii) $$\lim_{k\to\infty} \frac{\|\alpha_k r_k + (1 - \alpha_k)F_k\|}{\|s_k\|} = 0. \tag{2.6}$$

(iii) $$\lim_{k\to\infty} \frac{\|(A_k - F'_*)s_k\|}{\|s_k\|} = 0.$$

Proof. Adding and subtracting the appropriate quantities, we have

$$F_{k+1} = [F_{k+1} - F_k - F'_* s_k] - [A_k - F'_*]s_k + [\alpha_k r_k + (1 - \alpha_k)F_k].$$

From (2.5), (i) is equivalent to

$$\lim_{k\to\infty} \frac{\|F_{k+1}\|}{\|s_k\|} = 0.$$

Using Lemma 4.1.15 in [4] we have

$$\|F_{k+1} - F_k - F'_* s_k\| \le \frac{\gamma}{2}\|s_k\|(\|e_{k+1}\| + \|e_k\|). \tag{2.7}$$

The remainder of the proof is fairly straightforward.

□
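The identity that opens the proof can be checked directly from (2.2). The short derivation below is a sketch of that step and of how the three bracketed terms line up with statements (i)-(iii); it is supplied here for illustration and is not part of the original argument.

```latex
% From (2.2): s_k = x_{k+1} - x_k = -\alpha_k A_k^{-1}(F_k - r_k), hence
A_k s_k = -\alpha_k F_k + \alpha_k r_k .
% Substituting into the right-hand side of the identity:
\begin{align*}
&[F_{k+1} - F_k - F'_* s_k] - [A_k - F'_*] s_k + [\alpha_k r_k + (1-\alpha_k) F_k] \\
&\quad = F_{k+1} - F_k - A_k s_k + \alpha_k r_k + (1-\alpha_k) F_k \\
&\quad = F_{k+1} - F_k + (\alpha_k F_k - \alpha_k r_k) + \alpha_k r_k + F_k - \alpha_k F_k \\
&\quad = F_{k+1} .
\end{align*}
% Divide by \|s_k\|. By (2.7) the first bracket is o(\|s_k\|) because e_k \to 0.
% The remaining three quotients are
%   \|F_{k+1}\| / \|s_k\|                              (statement (i), via (2.5)),
%   \|(A_k - F'_*) s_k\| / \|s_k\|                     (statement (iii)),
%   \|\alpha_k r_k + (1-\alpha_k) F_k\| / \|s_k\|      (statement (ii)),
% and if any two of them tend to zero, the identity forces the third to do so.
```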

Observe that if for all $k$, $\alpha_k = 1$ and $r_k = 0$, then (2.2) becomes the standard
quasi-Newton method; moreover, in this case condition (ii) is trivially satisfied and Theorem 2.1
reduces to the standard Dennis-Moré characterization.

Condition (ii) tells us that essentially for Q-superlinear convergence we must have $\alpha_k \to 1$
and $r_k = o(\|s_k\|)$. We are somewhat concerned with this latter requirement for the following
reason. Our expectation is to be able to control the size of the perturbation vector $r_k$;
however, at the beginning of the iteration, when we must choose $r_k$, the step $s_k$ is unknown
to us. For this reason we look for a similar condition involving $\|F_k\|$, a quantity which is
readily available. However, we must add an assumption concerning the rate of convergence
of $\{x_k\}$.

Theorem 2.2. Let $\{x_k\}$ be generated by (2.2). Assume that S1, S2, and S3 hold and that
$x_k \to x^*$. Then any two of the following statements imply the third.

(i)' $x_k \to x^*$ Q-superlinearly.

(ii)' $\lim_{k\to\infty} \dfrac{\|\alpha_k r_k + (1 - \alpha_k)F_k\|}{\|F_k\|} = 0$ and the convergence of $\{x_k\}$ to $x^*$ is Q-linear.

(iii)' $\lim_{k\to\infty} \dfrac{\|(A_k - F'_*)s_k\|}{\|s_k\|} = 0$.

Proof. We must show that any two conditions in Theorem 2.1 are equivalent to the
corresponding two conditions in Theorem 2.2. Observe that from (2.3), the fact that
$s_k = e_{k+1} - e_k$, and the Q-linear convergence of $\{x_k\}$ to $x^*$, there exist positive constants
$\beta_1$ and $\beta_2$ such that for $k$ sufficiently large

$$\frac{\beta_1}{\|F_k\|} \le \frac{1}{\|s_k\|} \le \frac{\beta_2}{\|F_k\|}. \tag{2.8}$$

The proof of the theorem now follows from Theorem 2.1 and (2.8).

□

The assumption in (ii)' concerning the rate of convergence of $\{x_k\}$ can be replaced by
the following weaker statement:

The set

$$Q_1^*(\{x_k\}) = \left\{ \text{limit points of } \left\{ \frac{\|e_{k+1}\|}{\|e_k\|} \right\} \right\}$$

does not contain one and $\infty$, for at least one norm.

Clearly the set $Q_1^*(\{x_k\})$ depends on the norm selected. The largest element of
$Q_1^*(\{x_k\})$ is the well-known $Q_1$-factor. For more detail on this issue, see Chapter 9 of
Ortega and Rheinboldt [10].

In terms of secant methods, the assumption that $\{x_k\}$ converges to $x^*$ Q-linearly does
not seem restrictive. In fact, if the matrices $\{A_k\}$ satisfy a standard bounded deterioration
property, as do the well-known secant methods, then in an appropriate norm $x_k \to x^*$
Q-linearly (see Chapter 8 of Dennis and Schnabel [4] for more detail).

Theorem 2.2 tells us that in order to obtain Q-superlinear convergence we should have
$r_k = o(\|F_k\|)$ and $\alpha_k \to 1$. We find it interesting that this is exactly the condition given by
Dembo, Eisenstat, and Steihaug [2] for Q-superlinear convergence of their inexact Newton
method. Actually, they chose $\alpha_k = 1$ for all $k$. An obvious choice for the perturbation
vector is one of size $\|r_k\| = \sigma_k \|F_k\|$, where $\sigma_k \in (0, 1]$ and $\sigma_k \to 0$ as $k \to \infty$.
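To make iteration (2.2) concrete, the following is a minimal sketch in Python with NumPy, run on a small test system. Several ingredients are assumptions made only for this illustration and do not come from the text: the matrices $A_k$ are updated with Broyden's secant formula, the perturbation is $r_k = \sigma_k \|F_k\| u$ for a fixed unit vector $u$, and the sequences $\alpha_k = 1 - 2^{-(k+1)}$ and $\sigma_k = 2^{-(k+1)}$ are arbitrary choices with $\alpha_k \to 1$ and $\sigma_k \to 0$. The function name `damped_perturbed_qn` and the test function `F` are hypothetical.

```python
import numpy as np

def damped_perturbed_qn(F, x0, A0, alpha, sigma, max_iter=60, tol=1e-12):
    """Run x_{k+1} = x_k - alpha_k * A_k^{-1} (F_k - r_k), as in (2.2).

    Assumptions made for this sketch (not from the text):
      * A_k is updated with Broyden's secant formula;
      * r_k = sigma_k * ||F_k|| * u for a fixed unit vector u.
    Returns the final iterate and the list of residual norms ||F_k||.
    """
    x = np.asarray(x0, dtype=float)
    A = np.asarray(A0, dtype=float).copy()
    u = np.ones_like(x) / np.sqrt(x.size)        # fixed unit direction for r_k
    residuals = []
    for k in range(max_iter):
        Fk = F(x)
        residuals.append(np.linalg.norm(Fk))
        if residuals[-1] < tol:
            break
        rk = sigma(k) * residuals[-1] * u         # so that ||r_k|| = sigma_k ||F_k||
        s = -alpha(k) * np.linalg.solve(A, Fk - rk)
        x_new = x + s
        y = F(x_new) - Fk
        A = A + np.outer(y - A @ s, s) / (s @ s)  # Broyden secant update
        x = x_new
    return x, residuals

# Hypothetical test system with root (1, 1).
def F(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 2.0, x[0] - x[1]])

x0 = np.array([1.2, 0.9])
A0 = np.array([[2.0 * x0[0], 2.0 * x0[1]], [1.0, -1.0]])  # Jacobian at x0
x_final, residuals = damped_perturbed_qn(
    F, x0, A0,
    alpha=lambda k: 1.0 - 0.5 ** (k + 1),   # alpha_k -> 1
    sigma=lambda k: 0.5 ** (k + 1),         # sigma_k -> 0
)
print(x_final, residuals[-1])
```

Printing successive ratios `residuals[k + 1] / residuals[k]` is one way to observe the rate of convergence numerically for a given choice of `alpha` and `sigma`.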

3  Primal-dual  quasi-Newton  interior-point  method. 

In this section we formulate a primal-dual quasi-Newton interior-point method for
solving the constrained optimization problem

$$\begin{array}{ll} \text{minimize} & f(x) \\ \text{subject to} & h(x) = 0 \\ & x \ge 0 \end{array} \tag{3.1}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ and $h : \mathbb{R}^n \to \mathbb{R}^m$ are twice continuously differentiable functions.

The Lagrangian function associated with problem (3.1) is given by

$$\ell(x, y, z) = f(x) + y^T h(x) - z^T x \tag{3.2}$$

where $y \in \mathbb{R}^m$ and $z \in \mathbb{R}^n$ are the Lagrange multipliers associated with the constraints
$h(x) = 0$ and $x \ge 0$ respectively.
The Karush-Kuhn-Tucker (KKT) conditions for problem (3.1) are

$$F(x, y, z) = \begin{pmatrix} \nabla_x \ell(x, y, z) \\ h(x) \\ XZe \end{pmatrix} = 0, \qquad (x, z) \ge 0, \tag{3.3}$$

where $X = \mathrm{diag}(x)$, $Z = \mathrm{diag}(z)$, and $e \in \mathbb{R}^n$ is the vector of all ones.

Observe that the inequality constraints in problem (3.1), $x_i \ge 0$, $i = 1, \ldots, n$, can
be written $e_i^T x \ge 0$, $i = 1, \ldots, n$, where $e_i$ is the $i$-th natural basis vector, i.e., the $i$-th
component is one while all other components are zero. For $x$, a feasible point of problem (3.1),
we let $B(x) = \{i : x_i = 0\}$. As is usual in constrained optimization, $B(x)$ is the set of active
or binding inequality constraints. We will have need below to consider the gradients of the active
constraints. It should be clear that this set will be $\{e_i \in \mathbb{R}^n : i \in B(x)\}$.

In the study of Newton's method, the standard assumptions for problem (3.1) are

A1. (Existence) There exists $(x^*, y^*, z^*)$, a solution to problem (3.1) and its associated
Lagrange multipliers, satisfying the KKT conditions (3.3).

A2. (Smoothness) The Hessian operators $\nabla^2 f$, $\nabla^2 h_i$, $i = 1, \ldots, m$, are locally Lipschitz
continuous at $x^*$.

A3. (Regularity) The set $\{\nabla h_i(x^*) : i = 1, \ldots, m\} \cup \{e_i : i \in B(x^*)\}$ is linearly independent.

A4. (Second-Order Sufficiency) For all $\eta \ne 0$ satisfying $\nabla h_i(x^*)^T \eta = 0$, $i = 1, \ldots, m$;
$e_i^T \eta = 0$, $i \in B(x^*)$, we have $\eta^T \nabla_x^2 \ell(x^*, y^*, z^*) \eta > 0$.

A5. (Strict Complementarity) For all $i$, $z_i^* + x_i^* > 0$.


For a nonnegative parameter $\mu$, the perturbed KKT conditions associated with (3.3) are

$$F_\mu(x, y, z) = \begin{pmatrix} \nabla_x \ell(x, y, z) \\ h(x) \\ XZe - \mu e \end{pmatrix} = 0, \qquad (x, z) \ge 0. \tag{3.4}$$
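As an illustration of (3.3) and (3.4), the sketch below assembles the perturbed KKT residual $F_\mu$ in Python with NumPy and evaluates it on a small hand-made problem. The function name `perturbed_kkt_residual`, the toy problem (minimize $\tfrac12\|x - c\|^2$ subject to $x_1 + x_2 = 1$, $x \ge 0$ with $c = (2, -1)$), and its KKT point are assumptions introduced only for this example. The matrix returned by `grad_h` is taken to be $n \times m$, with the constraint gradients as columns.

```python
import numpy as np

def perturbed_kkt_residual(x, y, z, grad_f, h, grad_h, mu):
    """Stack the three blocks of F_mu(x, y, z) from (3.4).

    grad_h(x) is the n-by-m matrix whose columns are the gradients of the
    constraints h_i, so that grad_x l = grad_f(x) + grad_h(x) @ y - z.
    """
    e = np.ones_like(x)
    return np.concatenate([
        grad_f(x) + grad_h(x) @ y - z,   # gradient of the Lagrangian (3.2)
        h(x),                            # equality constraints
        x * z - mu * e,                  # perturbed complementarity XZe - mu e
    ])

# Hypothetical toy problem: minimize 0.5*||x - c||^2  s.t.  x1 + x2 = 1, x >= 0.
c = np.array([2.0, -1.0])
grad_f = lambda x: x - c
h = lambda x: np.array([x[0] + x[1] - 1.0])
grad_h = lambda x: np.array([[1.0], [1.0]])

# KKT point of the toy problem, worked out by hand for this example.
x_star = np.array([1.0, 0.0])
y_star = np.array([1.0])
z_star = np.array([0.0, 2.0])

res0 = perturbed_kkt_residual(x_star, y_star, z_star, grad_f, h, grad_h, mu=0.0)
res1 = perturbed_kkt_residual(x_star, y_star, z_star, grad_f, h, grad_h, mu=0.1)
print(res0)   # residual of (3.3) at the KKT point
print(res1)   # only the complementarity block changes when mu > 0
```

With $\mu = 0$ the function reduces to $F$ in (3.3); a positive $\mu$ shifts only the last $n$ components.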

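We describe a primal-dual quasi-Newton interior-point method for solving problem (3.1).

Algorithm 1. Let $w_0 = (x_0, y_0, z_0)$ be an initial point satisfying $(x_0, z_0) > 0$.

For $k = 0, 1, \ldots$, until convergence do

Step 1. Choose $\sigma_k \in (0, 1]$ and set $\mu_k = \sigma_k R_k$ for some $R_k \in \mathbb{R}$.

Step 2. Obtain $\Delta w_k = (\Delta x_k, \Delta y_k, \Delta z_k)^T$ as the solution of the linear system

$$M_k \Delta w_k = -F_{\mu_k}(w_k) \tag{3.5}$$

where

$$M_k = \begin{pmatrix} G_k & \nabla h_k & -I_n \\ \nabla h_k^T & 0 & 0 \\ Z_k & 0 & X_k \end{pmatrix}.$$

Step 3. Choose $\tau_k \in (0, 1)$ and set

$$\alpha_k = \min(1, \tau_k \hat\alpha_k), \qquad \alpha_{ky} = 1 \ \text{ or } \ \alpha_{ky} = \alpha_k,$$

where

$$\hat\alpha_k = \min\left( \frac{-1}{\min(X_k^{-1}\Delta x_k, -1)}, \ \frac{-1}{\min(Z_k^{-1}\Delta z_k, -1)} \right).$$

Step 4. Update

$$w_{k+1} = w_k + \Lambda_k \Delta w_k$$

where $\Lambda_k = \mathrm{diag}(\alpha_k, \ldots, \alpha_k, \alpha_{ky}, \ldots, \alpha_{ky}, \alpha_k, \ldots, \alpha_k)$;
in the above, the three groups of scalars have $n$, $m$, and $n$ members respectively.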
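The steps of Algorithm 1 can be sketched as follows in Python with NumPy. This is an illustration under stated assumptions rather than the authors' implementation: the toy problem (minimize $\tfrac12\|x - c\|^2$ subject to $x_1 + x_2 = 1$, $x \ge 0$) is hypothetical, $G_k$ is taken to be the exact Hessian of the Lagrangian, $R_k = \|F(w_k)\|$, $\sigma_k = \min(0.1, R_k)$, $\tau_k = \max(0.9, 1 - R_k)$, and $\alpha_{ky} = \alpha_k$. The step to the boundary is computed as the minimum of the two quotients in Step 3.

```python
import numpy as np

# Hypothetical toy problem: minimize 0.5*||x - c||^2  s.t.  x1 + x2 = 1, x >= 0.
c = np.array([2.0, -1.0])
grad_f = lambda x: x - c
hess_l = lambda x, y: np.eye(2)          # exact Hessian of the Lagrangian here
h = lambda x: np.array([x[0] + x[1] - 1.0])
grad_h = lambda x: np.array([[1.0], [1.0]])

def F_mu(x, y, z, mu):
    """Perturbed KKT residual (3.4); mu = 0 gives F in (3.3)."""
    return np.concatenate([grad_f(x) + grad_h(x) @ y - z, h(x), x * z - mu])

def step_to_boundary(v, dv):
    """-1 / min(V^{-1} dv, -1): largest t in (0, 1] with v + t*dv >= 0."""
    return -1.0 / min(np.min(dv / v), -1.0)

def algorithm1(x, y, z, max_iter=100, tol=1e-10):
    n, m = x.size, y.size
    stayed_positive = True
    for k in range(max_iter):
        R = np.linalg.norm(F_mu(x, y, z, 0.0))       # R_k = ||F(w_k)|| (assumed choice)
        if R < tol:
            break
        sigma = min(0.1, R)                          # Step 1: sigma_k in (0, 1]
        mu = sigma * R
        A = grad_h(x)
        M = np.block([                               # Step 2: the matrix M_k
            [hess_l(x, y), A, -np.eye(n)],
            [A.T, np.zeros((m, m)), np.zeros((m, n))],
            [np.diag(z), np.zeros((n, m)), np.diag(x)],
        ])
        dw = np.linalg.solve(M, -F_mu(x, y, z, mu))
        dx, dy, dz = dw[:n], dw[n:n + m], dw[n + m:]
        tau = max(0.9, 1.0 - R)                      # Step 3: tau_k in (0, 1)
        alpha_hat = min(step_to_boundary(x, dx), step_to_boundary(z, dz))
        alpha = min(1.0, tau * alpha_hat)
        # Step 4 with alpha_ky = alpha_k (one of the two options in Step 3).
        x, y, z = x + alpha * dx, y + alpha * dy, z + alpha * dz
        stayed_positive = stayed_positive and bool(np.all(x > 0) and np.all(z > 0))
    return x, y, z, stayed_positive

x_end, y_end, z_end, stayed_positive = algorithm1(
    np.array([0.5, 0.5]), np.array([0.0]), np.array([1.0, 1.0])
)
print(x_end, y_end, z_end, stayed_positive)
```

Replacing `hess_l` by a secant approximation would turn this Newton variant into a quasi-Newton one; the rest of the loop would be unchanged.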


The choice for $R_k$ will in general be $\|F(w_k)\|$; however, we leave it open to obtain a
certain amount of needed flexibility in the statement of our theorems in Section 5.

The choice $G_k = \nabla_x^2 \ell(w_k)$ corresponds to Newton's method. For this choice El-Bakry,
Tapia, Tsuchiya, and Zhang [5] established local convergence, superlinear convergence, and
quadratic convergence for Algorithm 1 for the appropriate choices of $\tau_k$ and $R_k$. Yamashita
[13] considered a somewhat different steplength than that described in Step 3; his choice
was based on a particular merit function. He then established a global convergence result
for his line-search algorithm. El-Bakry et al. [5] also gave a global convergence result for a
line-search globalization of their form of Algorithm 1. Observe that the choice of steplength
in Step 3, $\alpha_k = \tau_k \hat\alpha_k$ with $\tau_k \in (0, 1)$, keeps $x_{k+1}$ and $z_{k+1}$ positive. If $\tau_k$ were chosen
equal to one, then at least one component of $x_{k+1}$ or $z_{k+1}$ would be zero. We could
also use different steplengths for the $x$ and $z$ variables. The obvious choice would be to let
$\alpha_{kx} = \min(1, \tau_k \hat\alpha_{kx})$ and $\alpha_{kz} = \min(1, \tau_k \hat\alpha_{kz})$, where

$$\hat\alpha_{kx} = \frac{-1}{\min(X_k^{-1}\Delta x_k, -1)}$$

and

$$\hat\alpha_{kz} = \frac{-1}{\min(Z_k^{-1}\Delta z_k, -1)}.$$

Since the asymptotic properties of these choices are essentially the same, we will not
concern ourselves with other choices of steplength parameters. It should be clear that the
algorithmic choices are the choices of $\tau_k$, $\sigma_k$, and $G_k$, the approximation to $\nabla_x^2 \ell(w_k)$. Our
objective is to characterize Q-superlinear convergence in terms of the algorithmic choices.
A straightforward application of Theorem 2.2 would lead to a characterization in terms of
all the variables $(x, y, z)$. Such a characterization would be incomplete, since for equality
constrained optimization, where the $z$-variable is not present, the Boggs-Tolle-Wang
characterization is in terms of the $x$-variable alone. Effectively, the $y$-variable can be removed from the
problem, as demonstrated by Stoer and Tapia [11]. Our initial efforts in the current
research attempted to obtain such a characterization for Algorithm 1; however, we could
not do so without making assumptions which we considered undesirable. Therefore, we
turned to attempting a characterization in terms of the $(x, z)$-variables and were successful.
It follows then that in this application the primary variables are $x$ and $z$; each carries
independent information and cannot be removed from the problem. In retrospect we find
this occurrence fitting and not surprising.

4  An  equivalent  formulation. 

In  this  section  we  imitate  the  approach  taken  by  Stoer  and  Tapia  [11]  in  deriving 
the  Boggs-Tolle-Wang  characterization  for  equality  constrained  optimization.  Our  task  is 
to  construct  a  quasi-Newton  method  that  involves  only  the  (x,  z)-variables,  is  equivalent 
to  Algorithm  1  of  Section  3,  and  has  the  form  of  a  damped  and  perturbed  quasi-Newton 
method  as  described  by  (2.2).  This  equivalence  will  allow  us,  in  Section  5,  to  apply  our 
characterization  Theorem  2.2. 

Assumption A3 allows us to consider locally, i.e., in a neighborhood of $x^*$, the projection
operator

$$P(x) = I - \nabla h(x)[\nabla h(x)^T \nabla h(x)]^{-1}\nabla h(x)^T. \tag{4.1}$$
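The properties of $P(x)$ used later, in particular $P(x)\nabla h(x) = 0$, can be checked numerically. The sketch below, in Python with NumPy, builds the projector from a random full-column-rank matrix standing in for $\nabla h(x)$; the function name `projector` and the random data are assumptions made only for this illustration.

```python
import numpy as np

def projector(grad_h):
    """P = I - grad_h (grad_h^T grad_h)^{-1} grad_h^T, cf. (4.1).

    grad_h is the n-by-m matrix of constraint gradients, assumed to have
    full column rank so that grad_h^T grad_h is invertible.
    """
    n = grad_h.shape[0]
    return np.eye(n) - grad_h @ np.linalg.solve(grad_h.T @ grad_h, grad_h.T)

rng = np.random.default_rng(0)
grad_h = rng.standard_normal((5, 2))   # random stand-in for grad h(x), n = 5, m = 2
P = projector(grad_h)

print(np.linalg.norm(P @ grad_h))      # P annihilates the constraint gradients
print(np.linalg.norm(P @ P - P))       # P is idempotent
print(np.linalg.norm(P - P.T))         # P is symmetric
```

All three printed norms should be at the level of rounding error, which reflects that $P(x)$ is the orthogonal projector onto the null space of $\nabla h(x)^T$.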

In turn this allows us to consider the nonlinear equation

$$F_0(x, z) = \begin{pmatrix} P(x)(\nabla f(x) - z) + \nabla h(x)h(x) \\ XZe \end{pmatrix} = 0. \tag{4.2}$$

Observe that $F_0 : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$. We now demonstrate that Algorithm 1 is equivalent to
a damped and perturbed quasi-Newton method applied to equation (4.2). Toward this end
let $(x_k, y_k, z_k)$, $G_k$, and $\mu_k$ be as in the $k$-th iteration of Algorithm 1 and consider the linear
system

$$\begin{pmatrix} P_k G_k + \nabla h_k \nabla h_k^T & -P_k \\ Z_k & X_k \end{pmatrix} \begin{pmatrix} \Delta x_k \\ \Delta z_k \end{pmatrix} = -F_0(x_k, z_k) + \mu_k \hat e. \tag{4.3}$$

In (4.3), $\hat e$ is the $2n$-vector whose first $n$ components are zero and whose last $n$
components are one. We will also need to consider the formula

$$y_{k+} = -(\nabla h_k^T \nabla h_k)^{-1}\nabla h_k^T (G_k \Delta x_k + \nabla f_k - (z_k + \Delta z_k)), \tag{4.4}$$

where $(\Delta x_k, \Delta z_k)$ is the solution of (4.3).

Proposition 4.1. Let $(x^*, y^*, z^*)$ be a solution of the KKT conditions (3.3) at which the
standard assumptions A1-A5 hold. Then $(x^*, z^*)$ is a solution of the nonlinear equation (4.2)
and the standard Newton's method assumptions S1-S3 hold for $F_0$ at this solution. Moreover,
if $(\Delta x_k, \Delta y_k, \Delta z_k)$ is a solution of the linear system (3.5), then $(\Delta x_k, \Delta z_k)$ is a solution of
the linear system (4.3). Conversely, if $(\Delta x_k, \Delta z_k)$ is a solution of the linear system (4.3)
and we let $\Delta y_k = y_{k+} - y_k$, where $y_{k+}$ is given by (4.4), then $(\Delta x_k, \Delta y_k, \Delta z_k)$ is a solution
of the linear system (3.5).

Proof. We begin by establishing the equivalence between the linear systems (3.5) and (4.3).

Writing out (3.5) in detail gives

$$\begin{aligned} G_k \Delta x_k + \nabla h_k \Delta y_k - \Delta z_k &= -(\nabla f_k + \nabla h_k y_k - z_k) \\ \nabla h_k^T \Delta x_k &= -h_k \\ Z_k \Delta x_k + X_k \Delta z_k &= -X_k Z_k e + \mu_k e. \end{aligned} \tag{4.5}$$

Writing out (4.3) in detail gives

$$\begin{aligned} (P_k G_k + \nabla h_k \nabla h_k^T)\Delta x_k - P_k \Delta z_k &= -(P_k(\nabla f_k - z_k) + \nabla h_k h_k) \\ Z_k \Delta x_k + X_k \Delta z_k &= -X_k Z_k e + \mu_k e. \end{aligned} \tag{4.6}$$

We observe that we can write

$$P_k[G_k \Delta x_k + \nabla f_k - (z_k + \Delta z_k)] = G_k \Delta x_k + \nabla f_k - (z_k + \Delta z_k) + \nabla h_k y_{k+} \tag{4.7}$$

where $y_{k+}$ is given by (4.4).

Now, suppose $(\Delta x_k, \Delta y_k, \Delta z_k)$ solves (4.5). Multiplying the first equation by $P_k$, the
second equation by $\nabla h_k$, adding the two resulting equations, and recalling that $P_k \nabla h_k = 0$
leads us to the first equation in (4.6). Hence $(\Delta x_k, \Delta z_k)$ solves (4.6). Conversely, suppose
$(\Delta x_k, \Delta z_k)$ solves (4.6). Multiplying the first equation by $\nabla h_k^T$ gives the second equation
in (4.5). This in turn tells us that the first equation in (4.6) now implies that the
left-hand side of (4.7) is zero. Hence the right-hand side is zero and the first equation in (4.5)
holds with $y_k + \Delta y_k = y_{k+}$. This establishes the equivalence of the two linear systems (4.5)
and (4.6).

If $(x^*, y^*, z^*)$ solves (3.3), then clearly $(x^*, z^*)$ solves (4.2). Observing that
$P(x)(\nabla f(x) - z) = P(x)(\nabla f(x) + \nabla h(x)y_+(x^*, z^*) - z)$ and $y_+(x^*, z^*) = y^*$ we see that

$$F_0'(x^*, z^*) = \begin{pmatrix} P(x^*)\nabla_x^2 \ell(x^*, y^*, z^*) + \nabla h(x^*)\nabla h(x^*)^T & -P(x^*) \\ Z^* & X^* \end{pmatrix}. \tag{4.8}$$

An argument along the lines of the one given above can be used to show that the linear
system

$$F_0'(x^*, z^*)\begin{pmatrix} \eta_x \\ \eta_z \end{pmatrix} = 0 \tag{4.9}$$

is equivalent to the linear system

$$F'(x^*, y^*, z^*)\begin{pmatrix} \eta_x \\ \eta_y \\ \eta_z \end{pmatrix} = 0 \tag{4.10}$$

where $F$ is given by (3.3). Under the standard assumptions A1-A5, for $F$ given by (3.3),
we know that $F'(x^*, y^*, z^*)$ is nonsingular. Hence $F_0'(x^*, z^*)$ must also be nonsingular. It
should be clear that $F_0$ and $F$ have the same smoothness properties. This says that
assumptions S1-S3, appropriately stated, hold for $F_0$ at $(x^*, z^*)$. We have now established our
equivalence proposition.

□
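The equivalence between the full system (3.5) and the reduced system (4.3), together with the multiplier formula (4.4), can be checked numerically. The following Python/NumPy sketch does so on random data; every quantity in it (dimensions, the random matrices standing in for $G_k$, $\nabla h_k$, $\nabla f_k$, $h_k$, and the value of $\mu_k$) is an assumption made only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, mu = 4, 2, 0.1

# Random stand-ins for the quantities at iteration k (illustrative data only).
B = rng.standard_normal((n, n))
G = B @ B.T + n * np.eye(n)              # symmetric positive definite G_k
A = rng.standard_normal((n, m))          # grad h_k, full column rank
gf = rng.standard_normal(n)              # grad f_k
hk = rng.standard_normal(m)              # h_k
x = rng.uniform(0.5, 2.0, n)             # x_k > 0
z = rng.uniform(0.5, 2.0, n)             # z_k > 0
y = rng.standard_normal(m)               # y_k
I, X, Z = np.eye(n), np.diag(x), np.diag(z)

# Full system (3.5), written out as (4.5).
M = np.block([[G, A, -I],
              [A.T, np.zeros((m, m)), np.zeros((m, n))],
              [Z, np.zeros((n, m)), X]])
rhs_full = -np.concatenate([gf + A @ y - z, hk, x * z - mu])
dw = np.linalg.solve(M, rhs_full)
dx, dy, dz = dw[:n], dw[n:n + m], dw[n + m:]

# Reduced system (4.3), written out as (4.6).
P = I - A @ np.linalg.solve(A.T @ A, A.T)
N = np.block([[P @ G + A @ A.T, -P],
              [Z, X]])
rhs_red = -np.concatenate([P @ (gf - z) + A @ hk, x * z - mu])
dv = np.linalg.solve(N, rhs_red)
dx_red, dz_red = dv[:n], dv[n:]

# Multiplier recovered from (4.4).
y_plus = -np.linalg.solve(A.T @ A, A.T @ (G @ dx_red + gf - (z + dz_red)))

print(np.linalg.norm(dx - dx_red), np.linalg.norm(dz - dz_red))
print(np.linalg.norm(y + dy - y_plus))
```

The printed differences should be at the level of rounding error: the $(x, z)$ components of the two solutions agree, and $y_k + \Delta y_k$ coincides with $y_{k+}$.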

We have shown that obtaining $(x_k, z_k)$ from Algorithm 1 can be viewed as obtaining
$(x_k, z_k)$ from a damped and perturbed quasi-Newton method applied to the nonlinear
equation $F_0(x, z) = 0$ given by (4.2). Moreover, the approximate Jacobian has the form

$$\begin{pmatrix} P_k G_k + \nabla h_k \nabla h_k^T & -P_k \\ Z_k & X_k \end{pmatrix} \tag{4.11}$$

and the Jacobian at the solution is given by (4.8).

We are now ready to state our Q-superlinear convergence results.

5  Q-superlinear  convergence  characterization. 

In this section we apply the theory developed in Section 2 to the primal-dual
quasi-Newton interior-point method described by Algorithm 1 of Section 3. Recall that $G_k$ is our
approximation to $G_* = \nabla^2 f(x^*) + \nabla^2 h(x^*)y^*$. Also, $R_k$ appears in Step 1 of Algorithm 1.

Theorem 5.1. Let $\{(x_k, y_k, z_k)\}$ be generated by Algorithm 1. Assume that $\{(x_k, y_k, z_k)\}$
converges to $(x^*, y^*, z^*)$ and assumptions A1-A5 hold at $(x^*, y^*, z^*)$. Furthermore, assume
that $\tau_k$ and $\sigma_k$ have been chosen so that

(i) $\tau_k \to 1$.

(ii) $\sigma_k \to 0$.

Assume that either $R_k = O(\|s_k\|)$, where $s_k = (x_{k+1}, y_{k+1}, z_{k+1}) - (x_k, y_k, z_k)$, or
$R_k = O(\|F(x_k, y_k, z_k)\|)$ and $\{(x_k, y_k, z_k)\}$ converges to $(x^*, y^*, z^*)$ Q-linearly.
Then $\{(x_k, y_k, z_k)\}$ converges Q-superlinearly to $(x^*, y^*, z^*)$ if and only if

$$\lim_{k\to\infty} \frac{\|(G_k - G_*)(x_{k+1} - x_k)\|}{\|x_{k+1} - x_k\| + \|y_{k+1} - y_k\| + \|z_{k+1} - z_k\|} = 0. \tag{5.12}$$

Assume that either $R_k = O(\|s_k\|)$, where $s_k = (x_{k+1}, z_{k+1}) - (x_k, z_k)$, or $R_k = O(\|F_0(x_k, z_k)\|)$,
where $F_0$ is given by (4.2), and $\{(x_k, z_k)\}$ converges to $(x^*, z^*)$ Q-linearly. Then $\{(x_k, z_k)\}$
converges Q-superlinearly to $(x^*, z^*)$ if and only if


(5.13) 

$\|x_{k+1} - x_k\| + \|z_{k+1} - z_k\|$

Proof. The proof of the theorem follows by applying Theorem 2.1, Theorem 2.2, and
Proposition 4.1, and using (3.5), (4.8), and (4.11). We have used the following fact concerning
norms in finite dimensional spaces. Let $u \in \mathbb{R}^n$ and $v \in \mathbb{R}^m$. Also let $\|\cdot\|_n$ be a norm on
$\mathbb{R}^n$, $\|\cdot\|_m$ a norm on $\mathbb{R}^m$, and $\|\cdot\|_{n+m}$ a norm on $\mathbb{R}^{n+m}$. Then there exist positive constants
$\theta_1$ and $\theta_2$ such that

$$\theta_1(\|u\|_n + \|v\|_m) \le \|(u, v)\|_{n+m} \le \theta_2(\|u\|_n + \|v\|_m). \tag{5.14}$$

A proof of (5.14) can be obtained by working with the $\ell_1$ norm and the equivalence
of norms property. We also used the fact that $\tau_k \to 1$ implies $\alpha_k \to 1$ (see Step 3 of
Algorithm 1) under our assumptions. This fact can be found in Yamashita and Yabe [14].
Finally, we have removed all quantities that converged to zero and were redundant in the
characterization result.

□

Yamashita  and  Yabe  [14]  gave  a  characterization  which  has  the  flavor  of  (5.12).  However, 
their  assumptions  were  somewhat  more  restrictive. 